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1  Introduction 

Eucalyptus  and  InterLACE  are  two  projects  inte¬ 
grating  spoken  natural  language  interaction  with  the 
already-developed  graphical  user  interfaces  of  cur¬ 
rently  existing  military  simulation  systems.  We  take 
this  approach  (rather  than  developing  new,  fully  in¬ 
tegrated  systems  from  scratch)  to  facilitate  the  de¬ 
velopment  of  Navy  demonstrations  of  NL  technol¬ 
ogy  applied  to  their  own  tools,  as  well  as  to  explore 
the  relative  strengths  and  weaknesses  of  graphical 
vs.  NL  interfaces  and  how  the  two  can  be  used  in 
combination  to  yield  more  powerful  interaction  tech¬ 
niques.  The  natural  language  processing  capability 
of  the  two  systems  is  described  in  an  accompanying 
Proceedings  technical  note  (Wauchope  et  al.  1997); 
here  we  go  into  a  bit  more  detail  on  the  issues  of 
multimodal  integration  with  graphical  interfaces. 

2  Eucalyptus 

Our  first  venture  into  multimodal  interface  devel¬ 
opment  was  Eucalyptus,  a  spoken  natural  language 
interface  integrated  with  the  graphical  user  inter¬ 
face  of  the  KOALAS  AEW  Test  Planning  Tool,  a 
simulation-based  mockup  of  a  hypothetical  Naval  air 
combat  command  and  control  system.  Our  objective 
was  to  have  the  functionality  of  the  NL  interface  du¬ 
plicate  and  parallel  that  of  the  GUI,  allowing  com¬ 
parison  of  language-based  and  graphical  approaches 
to  predication  and  reference.  We  also  sought  for  the 
two  modalities  to  interact  in  a  number  of  ways,  such 
as  deictic  reference  (combined  pointing  and  speak¬ 
ing)  and  NL  interaction  with  graphical  dialog  win¬ 
dows. 

2.1  Guidelines 

We  adhered  to  the  following  guidelines  and  ap¬ 
proaches  in  developing  Eucalyptus. 

•  Spoken  inputs  are  normative  English  utterances 
only,  not  just  arbitrary  individual  words  or 


phrases.  Such  utterances  can  include  certain 
elliptical  or  fragmentary  forms,  but  only  those 
that  people  might  normally  use  with  each  other 
under  similar  circumstances.  As  a  result  Euca¬ 
lyptus  is  a  spoken  NL  system,  not  just  a  voice 
command  system.  This  is  not  to  say  that  voice 
commands  are  undesirable  or  that  people  pre¬ 
fer  to  use  conversational  language  when  talking 
to  a  machine,  but  just  represents  our  particular 
research  interests. 

•  We  address  NL  input  only,  not  output.  The 
NAUTILUS  NLP  processor  does  try  to  gener¬ 
ate  brief  and  helpful  English  sentence  fragments 
in  response  to  user  queries,  but  this  is  deliber¬ 
ately  underdeveloped  since  we  also  have  ongo¬ 
ing  work  in  adapting  a  full-fledged  NL  gener¬ 
ation  system  for  full-sentence  query  responses 
(integrated  into  our  second  demo  system  Inter¬ 
LACE,  to  be  discussed  later). 

•  We  are  careful  not  to  provide  the  NL  interface 
with  any  new  domain  functionality  not  already 
available  in  some  way  from  the  graphical  inter¬ 
face.  This  ensures  that  any  comparisons  be¬ 
tween  the  two  will  be  based  strictly  on  syntax 
and  convenience  of  use. 

•  We  disallow  “metalinguistic”  NL  references  to 
the  graphical  interface  itself  (e.g.  Open  the 
advice  explanation  window,  Push  the  ’Move 
Fighter’  button)  in  favor  of  direct  references  to 
the  task  domain  {Show  the  current  advice,  Move 
a  fighter).  This  avoids  the  user  treating  the 
graphical  interface  as  primary  and  speech  as  a 
subservient  adjunct,  a  kind  of  “vocal  mouse”  as 
it  were.  Again  this  is  not  to  deny  that  speech 
control  of  a  GUI  might  be  useful  for  handi¬ 
capped  individuals  or  in  extremely  hands-busy 
situations,  but  just  encourages  the  user’s  per¬ 
ception  of  the  two  interfaces  as  parallel  and 
equal  partners. 
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2.2  Multimodal  Integration 

To  issue  an  order  to  a  simulated  fighter  aircraft  using 
the  KOALAS  graphical  interface,  the  user  first  clicks 
a  button  labelled  with  the  order  type  (e.g.  Move,  Re¬ 
call)  which  pops  up  a  dialog  window  requesting  that 
the  user  select  the  desired  objects  by  mousing  them 
on  the  radar  screen,  echoing  the  objects’  names  in 
text  fields  as  they  are  selected,  and  finally  requesting 
an  accept/quit  decision.  The  Move  Fighter  dialog 
box,  for  example,  prompts  first  for  the  fighter  and 
then  for  the  new  patrol  station  (optional  parameters 
like  speed  are  not  prompted  for  but  can  be  adjusted 
at  any  time). 

This  rather  specialized  dialog  box  behavior  is 
distinctly  language-like  in  two  ways:  the  resulting 
graphical  syntax  (Move,  fighter,  station)  closely  re¬ 
sembles  NL  imperative  syntax  (Move  this  fighter  to 
this  station),  and  the  prompting  sequence  resembles 
a  natural  language  dialog.  As  a  result  we  elected 
to  use  KOALAS  dialog  boxes  both  for  accept/quit 
confirmation  of  completely  specified  NL  commands 
(Move  fighter  1  to  station  4),  as  well  as  to  prompt 
for  remaining  unspecified  arguments  (Move  fighter 
1).  In  either  case,  speech  can  also  be  used  to  dis¬ 
miss  the  dialog  (Okay,  Cancel  that)  as  well  as  to 
provide  remaining  arguments  to  the  dialog  (one  of 
the  fighters  holding  station  4)- 

The  second  form  of  multimodal  integration  in  Eu¬ 
calyptus  is  through  deictic  reference:  phrases  like 
this  fighter  accompanied  by  a  mouse  click.  Since 
mouse  clicks  in  the  KOALAS  graphical  interface  do 
not  create  a  “current  selected  object”  but  rather  re¬ 
port  the  closest  object  of  the  type  being  prompted 
for  by  the  current  dialog  box  prompt,  this  behav¬ 
ior  was  easy  to  modify  for  deictic  reference:  the 
mouse  click  identifies  a  set  of  possible  objects  and 
the  accompanying  NL  context  (this  fighter.  Move 
this  here)  provides  the  semantic  presuppositions  to 
disambiguate  the  denoted  object. 

Finally,  we  allow  purely  graphical  interactions  to 
have  elliptical  or  anaphoric  NL  followups.  For  exam¬ 
ple  after  completing  a  fighter  move  using  the  GUI, 
the  NL  inputs  Same  with  fighter  2  or  Move  fighter  3 
there  also  will  be  properly  understood. 

3  InterLACE 

InterLACE  is  a  multimodal  interface  to  the  Air 
Force’s  LACE  land/air  combat  simulation  sys¬ 
tem,  containing  an  extensive  real-world  cartographic 
database  of  central  Germany.  Unlike  KOALAS, 
LACE  came  to  us  with  no  graphical  interface,  al¬ 
though  all  the  “hooks”  were  in  place  to  implement 
a  map  display  with  animation  of  simulated  objects. 


Since  with  LACE  there  was  no  graphical  com¬ 
mand  interface  to  mirror  in  natural  language,  we 
opted  instead  to  focus  on  database  query  (also  in¬ 
cluded  in  Eucalyptus)  and  the  issuing  of  verbal  on¬ 
road  route  instructions  to  a  simulated  tank  unit.  De¬ 
ictic  reference  was  implemented  similary  to  Eucalyp¬ 
tus,  the  difference  being  that  we  chose  to  implement 
the  conventional  notion  of  “current  selected  object” : 
objects  clicked  by  the  mouse  are  highlighted  and  re¬ 
main  so  until  deselected.  The  user  can  also  select 
a  set  of  closely  adjacent  objects  with  a  double-click 
and  then,  as  in  Eucalyptus,  use  verbal  context  (this 
town.  What’s  the  population  here?)  to  resolve  the 
reference. 

3.1  Database  Query 

Because  of  the  large  size  of  the  database  (over  twelve 
thousand  objects)  and  the  need  to  reduce  the  search 
space  during  information  retrieval,  we  elected  to  con¬ 
strain  the  domain  of  NL  discourse  to  only  those 
objects  currently  visible  in  the  map  display,  repre¬ 
senting  about  one-eighth  of  the  entire  database  at 
any  one  time.  Hence  Does  this  road  cross  any  small 
towns?  only  reports  back  towns  that  are  currently 
visible  in  the  display,  based  on  the  assumption  that 
this  is  the  user’s  current  focus  of  attention  (this  con¬ 
straint  applies  only  to  quantified  NPs  and  not  proper 
names,  i.e.  Does  this  road  cross  Berlin?  does  not  re¬ 
quire  that  Berlin  be  visible). 

3.2  Tank  Commands 

LACE  provides  for  the  instantiation  of  simulated 
ground  vehicles  (such  as  tank  units  and  SAM  missile 
carriers)  and  includes  a  number  of  routines  for  on¬ 
road  navigation,  including  a  route  planner  for  com¬ 
puting  paths  between  towns.  Verbal  instructions  to 
the  tank  giving  just  a  destination  (Go  to  the  near¬ 
est  small  town)  invoke  the  route  planner,  but  since 
the  planner  does  not  accept  specification  of  means 
for  getting  to  the  destination,  commands  like  Head 
north  on  road  E2  for  5  km  instead  invoke  an  In¬ 
terLACE  function  called  the  “stretch  selector”  that 
tries  to  find  a  single  stretch  of  road  (a  sequence  of 
road  segments  between  two  adjacent  linkage  points) 
that  reconciles  every  component  of  the  means.  The 
tank  understands  both  cardinal  (northeast)  and  rel¬ 
ative  directions  (left),  and  in  ambiguous  situations 
chooses  the  stretch  that  bears  the  most  closely  in 
the  specified  direction. 
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